Granular Support Vector Machines Based on Granular Computing, Soft Computing and Statistical Learning
Authors
Abstract
With the emergence of biomedical informatics, Web intelligence, and E-business, new challenges are arising for knowledge discovery and data mining modeling problems. In this dissertation work, a framework named Granular Support Vector Machines (GSVM) is proposed to systematically and formally combine statistical learning theory, granular computing theory and soft computing theory to address challenging predictive data modeling problems effectively and/or efficiently, with a specific focus on binary classification problems. In general, GSVM works in 3 steps. Step 1 is granulation, which builds a sequence of information granules from the original dataset or from the original feature space. Step 2 is modeling, which trains Support Vector Machines (SVM) in some of these information granules when necessary. Finally, step 3 is aggregation, which consolidates the information in these granules at a suitable level of abstraction. A good granulation method that finds suitable granules is crucial for building a good GSVM. Under this framework, many different granulation algorithms are designed for binary classification problems with different characteristics, including the GSVM-CMW (cumulative margin width) algorithm, the GSVM-AR (association rule mining) algorithm, a family of GSVM-RFE (recursive feature elimination) algorithms, the GSVM-DC (data cleaning) algorithm and the GSVM-RU (repetitive undersampling) algorithm. Empirical studies in the biomedical domain and many other application domains demonstrate that the framework is promising. This dissertation work is a preliminary step: in the future it will be extended to build a Granular Computing based Predictive Data Modeling framework (GrC-PDM), with which we can create hybrid adaptive intelligent data mining systems for high quality prediction.
INDEX WORDS: Data Mining, Machine Learning, Statistical Learning, Computational Intelligence, Granular Computing, Granular Support Vector Machines, Bioinformatics
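The three GSVM steps described in the abstract (granulation, modeling, aggregation) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the dissertation's method: granules are formed by assigning each sample to the nearest of a few fixed, hand-picked centers (a stand-in for algorithms such as GSVM-CMW or GSVM-AR, whose details the abstract does not give), the SVM is a plain linear one trained by full-batch subgradient descent on the regularized hinge loss, and aggregation simply routes a query to its granule's local model. The names `nearest`, `train_linear_svm`, `fit_gsvm` and `predict_gsvm` are hypothetical.

```python
def nearest(centers, x):
    """Index of the granule center closest to x (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda k: sum((a - c) ** 2 for a, c in zip(x, centers[k])))


def train_linear_svm(points, labels, steps=500, lr=0.1, lam=0.01):
    """Full-batch subgradient descent on lam/2 * ||w||^2 + mean hinge loss."""
    n, dim = len(points), len(points[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1:  # sample violates the margin, so it contributes
                grad_w = [g - y * xj / n for g, xj in zip(grad_w, x)]
                grad_b -= y / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def fit_gsvm(points, labels, centers):
    # Step 1 (granulation): group samples by nearest center (assumed scheme).
    granules = {}
    for x, y in zip(points, labels):
        granules.setdefault(nearest(centers, x), []).append((x, y))
    # Step 2 (modeling): train an SVM only where a granule mixes both classes.
    models = {}
    for k, members in granules.items():
        ys = [y for _, y in members]
        if len(set(ys)) == 1:
            models[k] = ("const", ys[0])  # pure granule: no SVM needed
        else:
            xs = [x for x, _ in members]
            mean = [sum(col) / len(xs) for col in zip(*xs)]
            centered = [[a - m for a, m in zip(x, mean)] for x in xs]
            w, b = train_linear_svm(centered, ys)
            models[k] = ("svm", mean, w, b)
    return models


def predict_gsvm(models, centers, x):
    # Step 3 (aggregation): route the query to its granule's local model.
    # Assumes every granule received at least one training sample.
    model = models[nearest(centers, x)]
    if model[0] == "const":
        return model[1]
    _, mean, w, b = model
    score = sum(wj * (xj - m) for wj, xj, m in zip(w, x, mean)) + b
    return 1 if score >= 0 else -1


# Toy data: two pure granules and one mixed granule around (5, 5).
points = [(0, 0), (1, 0), (0, 1), (6, 5), (7, 5), (4, 5), (3, 5),
          (10, 10), (9, 10), (10, 9)]
labels = [-1, -1, -1, 1, 1, -1, -1, 1, 1, 1]
centers = [(0, 0), (5, 5), (10, 10)]
models = fit_gsvm(points, labels, centers)
n_svms = sum(1 for m in models.values() if m[0] == "svm")
```

In this toy setup only the mixed granule gets an SVM, which mirrors the "when necessary" wording of step 2. A real granulation algorithm would derive the granules from the data instead of taking fixed centers.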
Similar resources
INTERVAL ANALYSIS-BASED HYPERBOX GRANULAR COMPUTING CLASSIFICATION ALGORITHMS
The representation of a granule and the relations and operations between two granules are the main research topics in granular computing. Hyperbox granular computing classification algorithms (HBGrC) are proposed based on interval analysis. Firstly, a granule is represented as a hyperbox, which is the Cartesian product of $N$ intervals, for classification in the $N$-dimensional space. Secondly, the relation betwee...
A Comparative Study of Extreme Learning Machines and Support Vector Machines in Prediction of Sediment Transport in Open Channels
The limiting velocity in open channels to prevent long-term sedimentation is predicted in this paper using a powerful soft computing technique known as Extreme Learning Machines (ELM). The ELM is a single Layer Feed-forward Neural Network (SLFNN) with a high level of training speed. The dimensionless parameter of limiting velocity which is known as the densimetric Froude number (Fr) is predicte...
Fuzzy Rough Granular Neural Networks for Pattern Analysis
Granular computing is a computational paradigm in which a granule represents a structure of patterns evolved by performing operations on the individual patterns. Two granular neural networks are described for performing the pattern analysis tasks like classification and clustering. The granular neural networks are designed by integrating fuzzy sets and fuzzy rough sets with artificial neural ne...
Granular support vector machine based on mixed measure
This paper presents a granular support vector machine learning model based on mixed measure, namely M_GSVM, to solve the model error problem produced by mapping, simplifying, granulating or substituting data in traditional granular support vector machines (GSVM). In M_GSVM, the original data are mapped into the high-dimensional space by a Mercer kernel. Then, the data are divided into su...
Granular support vector machines with association rules mining for protein homology prediction
OBJECTIVE: Protein homology prediction between protein sequences is one of the critical problems in computational biology. Such a complex classification problem is common in medical or biological information processing applications. How to build a model with superior generalization capability from training samples is an essential issue for mining knowledge to accurately predict/classify unseen new s...